Wherefore, what is claimed is: 



1. A computer-implemented process for using feature selection to obtain a 
strong classifier from a combination of weak classifiers, comprising using a computer to 
perform the following process actions: 

(a) inputting a set of training examples, a prescribed maximum number of 
weak classifiers, a cost function capable of measuring the overall cost, and an acceptable 
maximum cost; 

(b) computing a set of weak classifiers, each classifier being associated to 
a particular feature of the training examples; 

(c) determining which of the set of weak classifiers is the most significant 
classifier; 

(d) adding said most significant classifier to a current set of optimal weak 
classifiers; 

(e) determining which of the current set of optimal weak classifiers is the 
least significant classifier; 

(f) computing the overall cost for the current set of optimal weak classifiers 
using the cost function; 

(g) conditionally removing the least significant classifier for the current set 
of optimal weak classifiers; 

(h) computing the overall cost for the current set of optimal weak classifiers 
less the least significant classifier using the cost function; 

(i) determining whether the removal of the least significant classifier results 
in a lower overall cost; 

(j) whenever it is determined that the removal of the least significant 
classifier results in a lower overall cost, eliminating the least significant classifier; 

(k) recomputing each classifier in the current set of optimal weak classifiers 
associated with a feature added subsequent to the eliminated classifier while keeping the 
earlier optimal weak classifiers unchanged; 

(l) repeating actions (f) through (k) until it is determined the removal of the 
least significant classifier does not result in a lower overall cost and then reinstating the 
last identified least significant classifier to the current set of optimal weak classifiers; 

(m) determining if the number of weak classifiers in the current set of optimal 
weak classifiers equals the prescribed maximum number of weak classifiers or the last 
computed overall cost for the current set of optimal weak classifiers is less than the 
acceptable maximum cost; and 

(n) whenever it is determined that the number of weak classifiers in the 
current set of optimal weak classifiers does not equal the prescribed maximum number of 
weak classifiers and the last computed overall cost for the current set of optimal weak 
classifiers exceeds the acceptable maximum cost, repeating actions (c) through (m) until 
it is determined that the number of weak classifiers in the current set of optimal weak 
classifiers does equal the prescribed maximum number of weak classifiers or the last 
computed overall cost for the current set of optimal weak classifiers becomes less than 
the maximum allowable cost, then outputting the sum of the individual weak classifiers 
as the trained strong classifier. 
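
The loop formed by actions (c) through (n) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the names `float_select`, `exp_cost`, `without` and `stump` are hypothetical, the candidate weak classifiers are fixed functions so the recomputation of action (k) is omitted, an exponential cost stands in for the input cost function, and the `current >= previous` stop is an added safeguard against cycling that the claim does not recite.

```python
import math

def exp_cost(selected, examples):
    """Assumed cost: exponential loss of the summed classifier, y in {+1, -1}."""
    return sum(math.exp(-y * sum(h(x) for h in selected)) for x, y in examples)

def without(items, h):
    """Return a copy of items with the classifier h left out."""
    return [g for g in items if g is not h]

def float_select(candidates, examples, max_weak, max_cost, cost=exp_cost):
    """Forward selection with conditional backward removal (hypothetical sketch)."""
    selected = []
    current = cost(selected, examples)
    while True:
        previous = current
        remaining = [h for h in candidates if all(h is not g for g in selected)]
        if not remaining:
            break
        # (c)-(d): add the candidate whose inclusion gives the lowest cost.
        best = min(remaining, key=lambda h: cost(selected + [h], examples))
        selected.append(best)
        current = cost(selected, examples)              # (f)
        # (e)-(l): tentatively drop the least significant member. The removal
        # is kept only if it lowers the cost; otherwise the set is left intact,
        # which plays the role of reinstating the classifier.
        while len(selected) > 1:
            worst = min(selected, key=lambda h: cost(without(selected, h), examples))
            reduced = without(selected, worst)
            reduced_cost = cost(reduced, examples)      # (h)
            if reduced_cost < current:                  # (i)-(j)
                selected, current = reduced, reduced_cost
            else:
                break
        # (m)-(n): stop at the size limit or once the cost is acceptable.
        if len(selected) >= max_weak or current < max_cost:
            break
        if current >= previous:   # added safeguard, not part of the claim
            break
    return selected, current

def stump(threshold, vote=0.5):
    """Hypothetical weak classifier: a fixed-vote threshold test on a scalar."""
    return lambda x: vote if x > threshold else -vote

examples = [(0.0, -1), (1.0, -1), (2.0, +1), (3.0, +1)]
candidates = [stump(1.5), stump(0.5), stump(2.5)]
selected, final_cost = float_select(candidates, examples, max_weak=2, max_cost=1.0)
print(len(selected), final_cost)
```

In this toy run the stump at 1.5 separates the two classes and is selected first; the backward step then leaves the set unchanged because no removal lowers the cost.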

2. The process of Claim 1 wherein the process action of computing each 
classifier of a set of weak classifiers comprises the process action of deriving each 
classifier based on a histogram of a scalar value feature for face training examples and a 
histogram of a scalar value feature for the non-face training examples. 

3. The process of Claim 1 wherein the most significant classifier includes the 
feature that is the most likely to predict whether a training example matches the 
classification of a particular classifier. 

4. The process of Claim 1 wherein the set of weak classifiers is designed to 
classify whether a training example is a face or non-face. 



5. The process of Claim 1 wherein the set of weak classifiers is designed to 
classify a training example as a text type. 



6. The process of Claim 1 wherein the set of weak classifiers is designed to 
classify a training example as a type of document. 

7. The process of Claim 1 wherein the set of weak classifiers is designed to 
classify a training example as a speech pattern. 

8. The process of Claim 1 wherein the set of weak classifiers is designed to 
classify a training example as a type of medical condition. 

9. The process of Claim 1 wherein a weak classifier h_j(x) is computed as 



\og-^ + log ^ 



wherein the probability densities of a feature j for a sub-sample x of a training 
example are denoted by P_j(x | y = +1) for a sought pattern and P_j(x | y = -1) for a 
non-sought pattern and the normalized weights are denoted by w. 

10. The process of Claim 9 wherein the probability density for a sought pattern 
and the probability density for a non-sought pattern can be estimated using the histograms 
resulting from weighted voting of the training examples. 
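
A histogram-based weak classifier of the kind described in Claims 2 and 10 might be sketched as below. The half log-ratio score, the `eps` smoothing term, the fixed bin layout, and the name `histogram_weak_classifier` are all assumptions made for illustration; they show one common form and are not necessarily the formula recited in Claim 9.

```python
import math

def histogram_weak_classifier(values, labels, weights,
                              n_bins=8, lo=0.0, hi=1.0, eps=1e-6):
    """Build a weak classifier from weighted histograms of one scalar feature.

    values  -- scalar feature value for each training example
    labels  -- +1 for a sought pattern, -1 for a non-sought pattern
    weights -- normalized example weights (each example votes with its weight)
    """
    pos = [0.0] * n_bins
    neg = [0.0] * n_bins

    def bin_of(v):
        # Map a value to a bin index, clamping values outside [lo, hi).
        idx = int((v - lo) / (hi - lo) * n_bins)
        return min(max(idx, 0), n_bins - 1)

    # Weighted voting: every example adds its weight to its class histogram.
    for v, y, w in zip(values, labels, weights):
        (pos if y > 0 else neg)[bin_of(v)] += w

    pos_total = sum(pos) or 1.0
    neg_total = sum(neg) or 1.0
    # Assumed score: half the log ratio of the two estimated densities.
    scores = [0.5 * math.log((p / pos_total + eps) / (n / neg_total + eps))
              for p, n in zip(pos, neg)]
    return lambda v: scores[bin_of(v)]

h = histogram_weak_classifier(values=[0.1, 0.2, 0.8, 0.9],
                              labels=[-1, -1, +1, +1],
                              weights=[0.25, 0.25, 0.25, 0.25])
print(h(0.85) > 0, h(0.15) < 0)
```

Bins that received no votes from either class score zero, so the classifier abstains where the training data says nothing.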

11. The process of Claim 9 wherein the process action of determining which 
of the set of weak classifiers is the most significant classifier comprises defining the most 

significant classifier (jc) as, h^{x) = arg min Y , wherein 

Hi = {h] {x) I } , h(x) = \sum_{m=1}^{M} h_m(x), and M is the total number of weak classifiers in 
the set of weak classifiers. 

12. The process of Claim 9 wherein the process action of determining which 
of the set of weak classifiers is the least significant classifier comprises defining the least 
significant classifier h'(x) as \arg\min_{h \in \mathcal{H}_M} J(H_M - h), where H_M denotes the strong 
classifier built upon the current set \mathcal{H}_M of selected weak classifiers. 

13. The process of Claim 1 wherein the process action of computing the 
overall cost comprises computing the overall cost J(h(x)) as 

J(h(x)) = \sum_i e^{-y_i h(x_i)}, wherein y_i = +1 for a sought pattern and y_i = -1 for a non-sought pattern 
and h(x_i) is a weak classifier in the set of weak classifiers. 

14. The process of Claim 1 wherein outputting the sum of the individual weak 
classifiers as the trained strong classifier comprises outputting the sum H(x) as 

H(x) = \sum_{m=1}^{M} h_m(x), wherein M is the total number of weak classifiers in the set of weak 
classifiers and h_m(x) is a weak classifier in the current set of weak classifiers. 
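
The summation of Claim 14 and an overall cost in the spirit of Claim 13 can be worked through on a small example. The sketch below assumes an exponential cost of the form exp(-y * H(x)) summed over the training examples; the function names and the two threshold classifiers are hypothetical.

```python
import math

def strong_classifier(weak_classifiers):
    """Return H, where H(x) is the sum of the individual weak classifiers at x."""
    return lambda x: sum(h(x) for h in weak_classifiers)

def overall_cost(classifier, examples):
    """Assumed exponential cost: sum over (x, y) of exp(-y * classifier(x))."""
    return sum(math.exp(-y * classifier(x)) for x, y in examples)

# Two hypothetical weak classifiers voting +0.5 or -0.5 on a scalar input.
weak = [lambda x: 0.5 if x > 1.5 else -0.5,
        lambda x: 0.5 if x > 0.5 else -0.5]
H = strong_classifier(weak)

examples = [(0.0, -1), (1.0, -1), (2.0, +1), (3.0, +1)]
outputs = [H(x) for x, _ in examples]
cost = overall_cost(H, examples)
print(outputs, cost)
```

Examples on which the weak classifiers agree with the label contribute exp(-1) each, while the example at x = 1.0, where the two votes cancel, contributes exp(0) = 1.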

15. A system for detecting a person's face in an input image and identifying a 
face pose range into which the face pose exhibited by the detected face falls, the system 
comprising: 

a general purpose computing device; and 

a computer program comprising program modules executable by the 
computing device, wherein the computing device is directed by the program modules of 
the computer program to: 

create a database comprising a plurality of training feature characterizations, 
each of which characterizes the face of a person at a known face pose or a non-face; 

train a plurality of detectors arranged in a pyramidal architecture to 
determine whether a portion of an input image depicts a person's face having a face pose 
falling within a face pose range associated with one of the detectors using the training 
feature characterizations; and wherein 

said detectors using a greater number of feature characterizations are 
arranged at the bottom of the pyramid, and 

said detectors arranged to detect finer ranges of face pose are arranged 
at the bottom of the pyramid; and wherein the program module to train a plurality of 
detectors comprises sub-modules to, 

(a) input a set of training examples, a prescribed maximum number of 
weak classifiers, a cost function capable of measuring the overall cost, and an acceptable 
maximum cost; 

(b) compute a set of weak classifiers, each classifier being associated to a 
particular feature of the training examples; 

(c) determine which of the set of weak classifiers is the most significant 
classifier; 

(d) add said most significant classifier to a current set of optimal weak 
classifiers; 

(e) determine which of the current set of optimal weak classifiers is the 
least significant classifier; 

(f) compute the overall cost for the current set of optimal weak classifiers 
using the cost function; 

(g) conditionally remove the least significant classifier for the current set of 
optimal weak classifiers; 

(h) compute the overall cost for the current set of optimal weak classifiers 
less the least significant classifier using the cost function; 

(i) determine whether the removal of the least significant classifier results in 
a lower overall cost; 

(j) whenever it is determined that the removal of the least significant 
classifier results in a lower overall cost, eliminate the least significant classifier; 

(k) recompute each classifier in the current set of optimal weak classifiers 
associated with a feature added subsequent to the eliminated classifier while keeping 
the earlier optimal weak classifiers unchanged; 

(l) repeat actions (f) through (k) until it is determined the removal of the 
least significant classifier does not result in a lower overall cost and then reinstate the last 
identified least significant classifier to the current set of optimal weak classifiers; 

(m) determine if the number of weak classifiers in the current set of optimal 
weak classifiers equals the prescribed maximum number of weak classifiers or the last 
computed overall cost for the current set of optimal weak classifiers is less than the 
acceptable maximum cost; and 

(n) whenever it is determined that the number of weak classifiers in the 
current set of optimal weak classifiers does not equal the prescribed maximum number of 
weak classifiers and the last computed overall cost for the current set of optimal weak 
classifiers exceeds the acceptable maximum cost, repeat actions (c) through (m) until it is 
determined that the number of weak classifiers in the current set of optimal weak 
classifiers does equal the prescribed maximum number of weak classifiers or the last 
computed overall cost for the current set of optimal weak classifiers becomes less than 
the maximum allowable cost, then output the sum of the individual weak classifiers as the 
trained strong classifier. 
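
The pyramidal arrangement of detectors in Claim 15 could be applied to an image window roughly as follows. This is only a sketch under assumptions the claim does not state: levels are evaluated from the top (coarse pose ranges) to the bottom (fine pose ranges), a window is rejected as soon as no detector at a level accepts it, and the names `classify_pose` and `toy_detector` are hypothetical. The toy detectors stand in for trained strong classifiers and treat the "window" as a yaw angle in degrees, or `None` for a non-face.

```python
def classify_pose(window, pyramid):
    """Return the pose ranges accepted at the bottom level, or [] if rejected.

    pyramid -- list of levels, top first; each level is a list of
               (pose_range, detector) pairs, detector(window) -> bool
    """
    accepted = []
    for level in pyramid:
        accepted = [pose_range for pose_range, detect in level if detect(window)]
        if not accepted:
            return []          # rejected early; finer detectors never run
    return accepted

def toy_detector(lo, hi):
    """Stand-in for a trained detector covering yaw angles in [lo, hi)."""
    return lambda yaw: yaw is not None and lo <= yaw < hi

pyramid = [
    # Top level: one coarse detector covering the whole pose range.
    [((-90, 90), toy_detector(-90, 90))],
    # Bottom level: finer pose ranges.
    [((-90, -30), toy_detector(-90, -30)),
     ((-30, 30), toy_detector(-30, 30)),
     ((30, 90), toy_detector(30, 90))],
]

frontal = classify_pose(10, pyramid)
non_face = classify_pose(None, pyramid)
print(frontal, non_face)
```

Running the cheaper, coarser detectors first means most non-face windows are discarded before the detectors that use more feature characterizations are evaluated.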

16. A computer-readable medium having computer-executable instructions for 
boosting the performance of a classifier in a statistical based machine learning system, 
said computer executable instructions comprising: 

identifying a set of weak classifiers each of which is associated with a 
feature found in a plurality of training examples, said weak classifiers collectively best 
classifying the training examples; 

linearly combining each of the weak classifiers in the identified set of 
weak classifiers to define a strong classifier, 

wherein the action of identifying the set of weak classifiers comprises 
using a sequential forward search for optimal weak classifiers with backtracking to ensure 
the inclusion of a weak classifier in the set of weak classifiers in lower overall 
performance in the form of increased processing time. 




